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Abstract 



O , A distributed adaptive algorithm to estimate a time-varying signal, measured by a wireless 

sensor network, is designed and analyzed. One of the major features of the algorithm is that 
no central coordination among the nodes needs to be assumed. The measurements taken by the 
nodes of the network are affected by noise, and the communication among the nodes is subject 
to packet losses. Nodes exchange local estimates and measurements with neighboring nodes. 
Each node of the network locally computes adaptive weights that minimize the estimation error 
. variance. Decentralized conditions on the weights, needed for the convergence of the estimation 

^\ ' error throughout the overall network, are presented. A Lipschitz optimization problem is posed 

, to guarantee stability and the minimization of the variance. An efficient strategy to distribute 

• the computation of the optimal solution is investigated. A theoretical performance analysis of the 

distributed algorithm is carried out both in the presence of perfect and lossy links. Numerical 
simulations illustrate performance for various network topologies and packet loss probabilities. 



J> ' Keywords: Distributed Estimation; Wireless Sensor Networks; Parallel and Distributed 

. Computation; Convex Optimization; Lipschitz Optimization. 

^ '■ 1 Introduction 



Monitoring physical variables is a typical task performed by wireless sensor networks (WSNs). Accu- 
rate estimation of these variables is a major need for many applications, spanning from traffic control, 
industrial manufacturing automation, environment monitoring, to security systems [l]-[3]. However, 
nodes of WSNs have limitations, such as scarcity of energy supply, lightweight processing and com- 
, munication functionalities, with the consequence that sensed data are affected by bias and noise, and 

^ ' transmission is subject to interference, which results in corrupted data (packet loss). Estimation 

5^ , algorithms must be designed to cope with these adverse conditions, while offering high accuracy. 

There are two main estimation strategies for WSNs. A traditional approach consists in letting 
nodes sense the environment and then report data to a central unit, which extracts the desired physical 
variable and sends the estimate to each local node for local action. However, this approach has strong 
limitations: large amount of communication resources (radio power, bandwidth, routing, etc.) have to 
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be managed for the transmission of information from nodes to the central unit and vice versa, which 
reduces the nodes' Hfetime. An alternative approach, which we investigate in this paper, enables each 
node to locally produce accurate estimates taking advantage of data exchanged with only neighboring 
nodes. Indeed, wireless communication makes it natural to exploit cooperative strategies, as it has 
been already used for coding and transmission [3, 4]. The challenge of distributed estimation is that 
local processing must be carefully designed to avoid heavy computations and spreading of local errors 
throughout the network. 

In this paper we consider the design and analysis of a distributed estimation algorithm. Specifically, 
a time- varying signal is jointly tracked by the nodes of a WSN, in which each node computes an estimate 
as a weighted sum of its own and its neighbors' measurements and estimates. The distributed estimator 
features three particular characteristics: it is robust to packet losses, it does not rely on a model of 
the signal to track, and it uses filter coefficients that adapt to the changing network topology caused 
by packet losses. We show that the estimation problem has a distributed implementation. It is argued 
that the estimator exhibits high accuracy, in term of estimation error variance, even in the presence 
of severe packet losses, if the signal to track is varying slowly. 

1.1 Related Work 

The estimator presented in this paper is related to recent contributions on low-pass filtering by diffusion 
mechanisms, e.g., [5] [12], where each node of the network obtains the average of the initial samples 
collected by nodes. In [13, 14] the authors study a distributed average computation of a time-varying 
signal, when the signal is affected by a zero-mean noise. Distributed filtering using model-based 
approaches is studied in various wireless network contexts, e.g., [11], [15]-[18]. In particular, distributed 
Kalman filters and more recently a combination of the diffusion mechanism with distributed Kalman 
filtering have been proposed, e.g., [19]. 

In [20], wc have presented a distributed estimator to track a time- varying signal without relaying 
on a model of the signal to track, in contrast to model-based approaches, e.g., [11, 16]. The approach 
is novel, since [8]-[10] are limited to averaging initial samples. Compared to [10]-[14] and [17]-[19], 
we do not rely on the Laplacian matrix associated to the communication graph. Our filter param- 
eters are computed through distributed algorithms which adapt to the network topology and packet 
losses, whereas for example [13] and [14] rely on centralized algorithms for designing the filters. The 
distributed estimator proposed in this paper features better estimates when compared to similar dis- 
tributed algorithms presented in the literature, but at the cost of a slightly increased computational 
complexity. With respect to our earlier work [20] , here we provide a major extension because we take 
into account lossy wireless communications. Packet losses require a substantial redesign and perfor- 
mance characterization of the filter proposed in [20]. In this paper, we explicitly consider the effect of 
packet losses (both i.i.d. and non-identical) in the design of the adaptive weights so that the estimation 
error is guaranteed to converge for any packet loss realization. The distributed minimum variance es- 
timator uses a Lipschitz optimization problem to distribute the centralized stability constrains, whose 
characterization is completely new. We devise a new algorithm to distribute efficiently the computation 
of the solution of the Lipschitz problem. An original analysis of the bounds on the estimation error 
variance as function of the packet loss probability and number of nodes is also discussed. Wc introduce 
some examples to show how such bounds can be refined significantly when the network topology is 
modelled by a graph in the class of finite Cayley graphs. 

The remainder of the paper is organized as follows: In Section 2 we review and extend the problem 
posed in [20] considering packet losses. In Section 3 we deal with the design of the optimal adaptive 
weights that minimize the estimation error variance. In Section 4 we show how to compute efficiently 
some thresholds needed to bound the norm of the estimation error. In Section 5 we determine bounds 
on the estimation error variance achieved by the proposed algorithm. Monte Carlo simulations are 
reported in Section 6 to illustrate the performance of the proposed algorithm. Conclusions are drawn 
in Section 7. 
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1.2 Notation 



Given a stochastic variable x, Ex denotes its expected value. With E yX we mean that the expected 
value is taken with respect to the probability density function of y. We keep explicit the time depen- 
dence to remind the reader that the realization is given at time t. With || • || we denote the .^^-norm of 
a vector or the spectral norm of a matrix. Given a matrix A, ^„j(A) and £m{A.) denote the minimum 
and maximum eigenvalue (with respect to the absolute value of their real part), respectively, and its 
largest singular value is denoted by 7(A). Given the matrix B, A o B is the Hadamard (element- wise) 
product between A and B. With a ^ b and a ^ b denote the element-wise inequalities. With I and 1 
we denote the identity matrix and the vector (1, . . . , 1)^, respectively, whose dimensions are clear from 
the context. Let Wq = IN U {0}. To keep the notation lighter, the time dependence of the variables 
and parameters is not explicitly indicated, when this does not create misunderstandings. 

2 Problem Formulation 

Consider a WSN with > 1 sensor nodes. At every time instant, each sensor in the network takes 
a noisy measure Ui{t) of a scalar signal d{t), namely Ui{t) = d(t) 4- Vi{t), for t G INq and for all 
i = 1, . . . ,N. We assume that Vi{t), for all i, are normally distributed with zero mean and variance cr^ 
and that Eu,(t)uj(t) = for all i e INq. 

We model the network as a weighted graph. In particular we consider a graph, 0{t) = (V, £), where 
V = {1, . . . , N} is the vertex set and f C V x V is the edge set. 

The set of neighbors of node i e V plus node i is denoted as 

M = {jeV:(i,i)ef}u{i}, 

Namely A/i is the set containing the maximum number of neighbors a node i can have, including itself. 

Every node broadcasts data packets, so that these packets can be received by any other node 
in the communication range. Packets may be dropped because of bad channel conditions or radio 
interference. Let 4>ij{t), with i 7^ j, be a binary random variable associated to the packet losses from 
node i to j at time t [21]. This random variable is defined on the probability space (r2,JF, Pr), where 
= {0, 1}, is a CT-algebra of subsets of and Pr(.) a probability measure. For i ^ j, we assume 
that the random variables ^ij (t) are independent with probability mass function: 

PT{4>ij{t) = 1) =Pij , 
Pv{^ij{t) = 0) = Qij = 1 - Pij , 

where pij G [0, 1] denotes the successful packet reception probability. Clearly that pu =- 1, since 
information locally available is not subject to packet losses. Note also that pji = if the packet sent 
from node j to i collided at node i due to too much wireless interferences, or if the wireless channel of 
the link from j to i is under deep fading, or if i is too far from node j to receive packets from node j. 
The packet reception probabilities are assumed to be independent among links, and independent from 
past packet losses. These assumptions are natural when the coherence time of the wireless channel is 
small if compared to the typical communication rate of data packets over WSNs [21, 22]. 

We assume each node i computes an estimate xi (t) of d(t) by taking a linear combination of its own 
and of its neighbors' estimates and measurements. Define Xi(^) = {x\{i)^ . . . ,XM{t)Y s-i^d similarly 
Ui(t) = (ui(t), . . . ,UN{t))^, then each node computes 

Xi{t) = {ki<i>if{t) Mt - 1) + 0^i<l>if{t) Ui(i) , (2.1) 
with Xj(0) = Ui(0), and where 

{k,<f>,){t) = kj{t) O 0f(t) = (fcl(t), fc2(t), . . . , kN{t)f O {Mt),Mt): . • . , CPN{t)f ■ 

with k^(t) G X 1, in which the j-th element is the weight coefficient used by node i for information 
coming from node j at time t, and 0i(t) G x 1 denotes the vector of the packet reception process 
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as seen from node i with respect to all nodes of the network. Specifically, the jth element of <Pi(t), 
with j ^ i, be (i>ij(t). Let ip^ G fi^ denotes a realization of the process at time t. Notice that 

at a given time instant, the j-th component of <^j|( is zero if no data packets are received from node 
j. Let Mtp. = {j € A/i : Vij\t 7^ 0}, namely a such set collects the nodes communicating with node i at 
time t. The number of nodes in the set is \Mip. \ = <pJ^j<Pj|j. 

The vector (hj<^J(i) e M^^^ and ^ {t) G R^^^ are constructed from the elements hij{t), similarly 

to (ki0.)(t). 

3 Distributed Minimum Variance Estimator 

In this section we describe how each node computes adaptive weights to minimize its estimation error 
variance. 

3.1 Estimation Error 

Define the estimation error at node i as ej(t) = Xi{t) — d{t)l. Introduce 6{t) = d{t) — d{t — 1), then 
the expected error with respect to the measurement noise Vj(t) = {vi{t), . . . jV^it))"^ is given by 

E vei(t) = 0^i(pif{t) E^Biit - 1) + d{t) {0^i(pif{t) 1 + {hi(Pif{t) 1 - 1) - 6{t){]^i<Pif{t)l , (3.1) 

where we set ei{t) = (ei(t), . . . ,ejv(t))'^. 

Typically one is interested in designing an unbiased estimator. Notice, however that in (3.1) the 
expected error depends on both unknowns d{t) and 6{t). The following condition eliminates the 
dependence from d{t): 

{kiCl,,f{t)l + M,f{t)l = l, (3.2) 

for any possible reahzation of the packet loss process </>j(t). Note that (3.2) holds both in the presence 
of packet i.i.d. losses , and in the presence of non-identical losses. The term S{t) can be removed by 
imposing that 

(ki0,)^(t)l = O, (3.3) 

for any possible realization of the packet loss process (pi{t). By imposing constraints (3.2) and (3.3), the 
unknown terms would disappear from the expected error equation, so the minimum variance estimator 
would be such that (k,0^)^(t) = and {hi<f)^)'^ (t) = l/\Af^.\l, where jA/",^. | is the number of neighbors, 
including node i, that are successfully communicating with node i at time t. Wc; will show in the next 
sections that by imposing only constraint (3.2) we are able to design an estimator that has lower 
variance than one that also obey (3.3). The price paid for better performance is a biased estimator. 
However, assuming that d{t) is slowly varying (or that the sampling frequency is high enough with 
resect to the variation of the signal), the bias is negligible and the proposed estimator outperforms the 
unbiased one in terms of the estimation error variance. This can also be understood from an intuitive 
point of view: having (\s.ilp^)'^ (t) = means that nodes arc disregarding previous estimates and arc 
just using current measurements, which are typically corrupted by high noise. Having a term that 
also weights previous estimates allows us to increment the total available information at each node, 
obtaining a much lower estimation error variance. 

Notice that although (3.2) holds, and thus d{t) is eliminated from (3.1), we need further conditions 
on vector {ki(f)^){t) in order to guarantee that the expected error asymptotically decreases. 

3.2 Convergence of Estimation Error 

In this subsection wc derive conditions on the weights that ensure that the estimation error decreasing 
over time, regardless the measurement noise and the packet loss processes that affect the system. In 
particular, we want to determine conditions on the weights so that | E,^ E vei(f)| ^ as t — > +00. 
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Assume that constraint (3.2) holds, and consider the expected value with respect to the packet loss 
process of (3.1), then 

E^E,ei(t) = E^(kf (t) o 0.(i)^) E^Evei(t - 1) - <5(t) E^(hf (i) o 0.(t)^) 1 , 

where we have used the fact that the packet losses at time t are independent from the preceding time 
instants, so that E ^{kf (t) o (p^{t)'^ W,^ei{t- 1)) = E ^(kf (f) o 0.(t)^) E <^ E vei(< - 1). It is clear that 
the evolution of E^Evei(f) depends on the overall error vector e,(f — 1), namely, the error at the 
local node depends on the estimation error of neighboring nodes. We thus need a set of other \J\fi \ — 1 
equations to describe the estimation error of all nodes in A/i . Obviously, each new equation will depend 
on the estimation error of nodes that are two hops from node i, and so on. The full network will be 
considered in this process of adding equations. Let e{t) G be the estimation error of the overall 
network, then 

E^Eve(f) = E^(K(f)o$(f))E^E„e(t-l)-5(t)E^(H(f)o$(f))l, (3.4) 

where K.{t) = (ki(t), \i.2(t), . . . , \s.N{t))'^ is the matrix whose rows are the vectors kf (t), i = 1, . . . ,N, 
and <f>(t) = {cj)i{t), 4>2(t)j • • • I 4>N{'t)) is the matrix whose rows are the vectors (pfit), i = I, . . . ,N. 

Let be a realization of $(f) at time t, namely (p^^. = ¥'2|ti • • • ' Vjv|t) • The following result 

holds: 

Proposition 3.1. Consider the system (3.4) and assume that 

(i) 7(K(f)o(p|j) < 7niax < 1 for allt and for each and every packet loss realization ip\^^ of^{t). 

(it) \S{t)\ < A for allte Mq. 
Then, considering independent packet losses, we have that 



lim ||E^Eve(i)|| < (3.5) 

t^+oo 1 — 7i„ax 

Proof. For the sake of notational simplicity, define s{t) = E<^Eve(f). The dynamics of s{t) are thus 
given by a deterministic time- varying linear system. Consider the function V{s{t)) = \\s{t)\\. Simple 
algebra gives that 

V{s{t)) < II E^(K(t) o ^t))\\V{s{t - 1)) + II E^(K(t) o ^t))\\AVN. 

Now, consider that 

E^(K(i)o$(i))= J2 (K(i)°<Pit)Pr(V|t), 

where Pr((p|() is the probability of the packet reception realization at time t. The expectation is 
given by the sum of a finite number of combinations S of possible packet loss realizations, since the 
network has a finite number of links, and in each link a packet can be either successfully received or 
dropped, so that |E| = 2l^l. It follows 

||E<^K(f)o$(f)|| = 11 ^ (K(Oo<p|,)Pr(<p|,)|| < J2 l|K(0o<P|,||Pr(¥>|,)<7^ax, 
because obviously J2(fi^^e e ^^iflt) — 1> ^'^d, from assumption (i), ||K(f) o (p^^\\ < 7max < 1- Therefore 

V{s{t)) < 7Lx^(s(i - 1)) + 7ma.^^T^Av^ , 

7max 

from where, taking the limit t — > +oo, the proposition follows. □ 
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Remark 3.2. Notice that the expected error converges to a neighborhood of the origin exponentially 
fast, and more precisely with rate 7max < 1- 

Proposition 3.1 provides us with conditions for the convergence of the estimation error of the entire 
network to a neighborhood of the origin. It follows that the estimation error of the entire network is 
subject to a cumulative bias. It is clear that such a bias depends on ||K(<) o ^fi^^\\ and A. If the signal 
d{t) is slowly varying, namely A -C 1, and ||K(t) o (p^^\\ small, then the bias will be small. 

Notice that, in order to ensure that the estimation error decreases at each node, a condition at 
network level is required, namely it must hold that 7(K(t) o (p^^) < 7i„ax < 1- The constant 7max can 
be chosen by fixing a maximum cumulative estimation error and solving (3.5) for 7max, as we show in 
Section 6. 

We will show in the next subsection how to choose the weights ki(t) and hi(t) locally at each node 
so that the estimation error variance is minimized. 

3.3 Distributed Computation of Filter Coefficients 

To design a minimum variance distributed estimator, we need to consider how the error variance 
evolves over time. The estimation error variance dynamic at node i is given by 

Ev(e,(t) - Evei(t))2 = (k,0J^(i)Pi(t - l)(ki0J(O + a2(hi0,)^(O(hi0,)(i) (3.6) 

where 

Pi{t-1)= E(ei(t)- Eei(i))]E(ei(t)- Eei(t))^eR^>^^. 
We assume that Xi(0) = Ui(0), and 

ei{0) = Ui{0) - py^pY E ' 

because an initial (rough) estimate of d{0) can be computed as arithmetic average of the \Af,f.\ — 1 
measurements received from neighboring nodes. 

The estimation error variance (3.6) can be rewritten as 

Ev(e,(t) - E ve,(t))2 = kj{t) (p,(t - 1) o {<p^{t)ci>J (t))) k,(t) + a^hf {t)ci>,{t)<pf {t)hj (t) 

where we used the fact that a^Ba = tr (Baa"^) and tr (A o B)C = tr (C^ o B)A'^. 

The optimal weights ki(f) and hi{t) arc chosen so that at each time instant, for any given realization 
of the packet loss process tp^lt of 4>i{t), the variance Ev{ei{t) — Wjvei{t)\<p^{t) = Vi|t)^ is minimized, 
under the constrain (3.2) and that j(K{t) o ip^^) < jy^ax < 1- As already mentioned, the second 
constraint is global, since K(t) o ^p^^ depends on all k,;(t), i = 1. . . . ,N. However, it is possible to 
determine local conditions so that the global constraint is satisfied. We show next how this can be 
done. 

For i = 1, . . . , A^, we define the set 0^. ~ {j i '■ M^p. n N^p . 7^ 0} U {M^.}, which is the collection 
of communicating nodes located at two hops distance from node i plus communicating neighbors of i, 
at time t. The following result holds: 

Proposition 3.3. Suppose there exists il){t) = {i^i{t), i'2{t), . . . , V'Ar(i))"^ >- 0, such that 

Mt) + V^ \l^o{t) < 7ma^ Vi. (3.7) 

// \\ki{t) o < V>i(t), i=l,...,N, then 7(K(i) o ip^ < 7^^, < 1. 



6 



Proof. The proof is similar to the proof of Proposition III.l in [20]. □ 

Using this proposition, the global constraint 7(K(t) o < 7max < 1 can be replaced by the 
constraint ||kj(t)o(^.|j|p < where satisfies the set of nonlinear inequalities (3.7). Therefore, 

each node needs to solve the following optimization problem 

k,(S(t) ^^^^^ ~ ° (<^»|tV^t)) ki W + cj^\ij {t)iPi^tVlMt) (3-8) 

S.t. ((ki(i)+hi(t))^0(^,|,) 1 = 1 

||ki(i)oV'i|tll'< V'i(i)- 

We will discuss in Section 4 how to compute the values of tpi{t), which are needed to state problem (3.8). 
Observe that the optimization problem (3.8) is a Qiiadratically Constrained Quadratic Problem [23, 
pag. 653]. It admits a strict interior point solution, corresponding to kj(t) = and (hj(t) o (p^^^) 1 = 1. 
Thus Slater's condition is satisfied and strong duality holds [23, pag. 226]. The problem, however, 
docs not have a closed form solution and thus wc need to rely on numerical algorithms to derive the 
optimal ki(t) and h.i{t). We have the following proposition. 

Proposition 3.4. For a given covariance matrix P,(t — 1) and a realization (p^ of (t>i{t), the values 
ofki{t) andhi{t) that minimizes (3.8) are 

{{Pi{t-i)+Mt)i)°fiitv>j;tf'Pi\t ^^^^ 

(^((P,(t - 1) + A,(t)I) o <p,^,cpiy+ a-n^ Vi|t ' 
Kit) = ^- ^ -J ^ , (3.10) 



'^V^t (((Pi(t - 1) + Ai(i)I) ° <Pi\t<Pl)^ + ^-'i) fiit ' 



with the optimal Lagrange multiplier Xi{t) G [O, max (O, c^/ ■\/|A/^JV^(0 ^ ^m{Pi{t — 1)))] • 

Proof. The proof is similar to the proof of Proposition III. 2 in [20]. □ 

Remark 3.5. Modeling the packet loss by the Hadamard product allows us to obtain weights hav- 
ing a similar form to those we obtained in the case of no packet loss [20]. However, this result is 
not a straightforward application of [20] because (3.9) and (3.10) are obtained by exploitation of the 
Hadamard product and the Moore-Penrose pseudo-inverse in the computation of the Lagrange dual 
function and the KKT conditions. Therefore, the previous proposition generalizes our earlier result for 
any given realization of the packet loss process. In the special case when ip^ = 1, namely when there 
are no packet losses, we reobtain the result in [20]. 

Previous proposition provides us with an interval within which the optimal Aj is located. Simple 
search algorithms can be considered to solve numerically (kj)^kj — tpi = for Ai, such as, for example, 
the bisection algorithm. 

It is worth noting that ktj, and similarly hij, in (3.10) are zero if node j does not communicate 
zero. The pseudo-inverse {(Pi{t — 1) + Xi{t)T) o (pm<fj^^)^ maintains the zeros in the same position as 



with node i because of a lost packet. In such a case the j-th row and column of the matrix cp^^^ipj^^ is 
zero. The pseudo-inverse ((F 
those in the matrix (pmipj,^. 



3.4 Error Covariance Matrix 

Proposition 3.4 provides us with the optimal weights that minimize the estimation error variance at 
each time step. The optimal weights kj(t) and h.i{t) depend indirectly on the thresholds 'i[i{t), through 
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the Lagrangian multiplier Xi{t), and directly on the error covariance matrix Fi{t — 1). We will discuss 
in the next section how it is possible to compute such thresholds ipi {t) in a distributed way, whereas we 
dedicate the rest of this subsection on discussing how to locally compute the error covariance matrix 
Pi{t — 1). More precisely, because of the packet loss process, node i requires only the elements of 
Pj(i — 1) corresponding to its neighbors, namely the matrix Pi{t — 1) o (p^f.(pj^^. 

Each node can estimate from data the error covariance matrix, which we denote with Ti{t), as 
discussed in [20]. However, here we need to extend the approach to the case of packet losses, because 
the design of the estimator of the covariance matrix is tricky when packets are lost. If a node j 
exchanges data with its neighboring node i, after an outage period, node i needs to re-initialize the 
j-th row and column of T'i{t) reasonably in order to take advantage of the new acquired neighbor. We 
consider the following re-initialization of elements of the error covariance matrix Ti{t). 

If at time t a new neighbor of a node is exchanging data, then the diagonal element of the estimate 
of the error covariance matrix at time t — 1, corresponding to such a neighbor, is initialized to the 
maximum element in the diagonal of the error covariance matrix. More precisely, let j S J\fi and 
assume that for t G {ti,t2), j ^ ^Vi\ti ' ^ ^ -^'fiit2 ' Then 



and for i G Afi, 



Tiit2 - 1) 



ti{t2 



max Ti{t2 - 1) 



Tiit2 - 1) 



:= 0. 



This heuristic is motivated by the fact that all nodes are collaborating to build and estimate of d{t), 
and they are using the same algorithm. Thus the maximum variance of the estimation error that a 
neighbor of a node is affected by must not be larger than the worst variance of the estimation error of 
other neighbors. Obviously, chances are that the heuristic might overestimate the variance associated 
to a new neighbor. However, from simulations in Section 6 we see that this strategy works well in 
practice, even in the presence of high packet loss probabilities. 



4 Computation of the Thresholds 

From Proposition 3.3, we notice that thresholds Vi's need to be upper bounded to guarantee conver- 
gence of the estimation error. It holds that the larger the value of tpi{t) the lower the error variance. 
Indeed, after some algebra, it follows that 

2 

Ev(e,(i) - Eve.(i)|0,W = fut)' < ^ r • (4.1) 



^V^, (((P(i - 1) + A. (i)I) ° fiit'Pl)'' + <^-'i) 'Put 



From this inequality we have that the estimation error variance at the node i decreases as Xi{t) 
decreases. From Proposition 3.4 wo sec that if ipi is large, then the Lagrangian multiplier Aj is small, 
since Ai e [O, max {O, / y/\N,^.\ijji - £m{Ti))]. 

According to the arguments above, we are interested in determining the largest solution of the 
nonlinear equations in Proposition 3.3. Therefore, we consider the following optimization problem: 

max l^V(i) (4.2) 

V(f) 

s.t. S{tp{t)) ^ (4.3) 
where S(V'(t)) = {Si{tp{t)), . . . , 5^(V(i)))^ and 



5,(V'(t)) = v^w + \/v^ J2 \/v^-7n 
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The solution of previous problem can be computed easily via standard centralized approaches, but in 
our setup the computation of the solution must be obtained in a decentralized fashion. The distributed 
computation of the solution could be performed through message passing, as in [24]. However, the 
converge speed is prohibitive. Hence, we consider an alternative approach. The fact that in (4.3) only 
information from two-hop neighboring nodes is required, and not of the entire network, allows us to 
develop a decentralized algorithm to compute the optimal solution. This is obtained in two steps. First 
we show that the optimal solution satisfies the inequality constraints (4.3) with equality. Second, we 
build on this to distribute the computation among nodes to obtain the optimal solution. We provide 
details in the sequel. 

4.1 Equality constraints 

In this section, we show that there is a global optimal solution of (4.2) that satisfies the inequality 
constraints (4.3) with equality. In particular we have the following important result. 

Theorem 4.1. Problem (4.2) admits a global optimum '4>*{t), which is the solution of the following 
set of nonlinear equations: 

+ E ^/^-^— = ^ = i,-..,iv, (4.4) 

where Q^^ = {j ^ i : N^^ n 7^ 0} U {7V<^J. 

To prove this theorem, we need some intermediate technical results: 

Lemma 4.2. There exists a feasible solution ip^{t) = {tpi{t), . . . ,ipi{t), ■ ■ ■ ,4'n{'^))'^ > ^ of (4.2), 
where 

V'f(i) = ^(V|e^,P+4-|e^j)' ^=l,...,N. (4.5) 

Proof. The i-th element of i/j^, t/^l, is constructed by considering the ith constraint, and imposing that 
the other variables xpj, for j € 0<p., assume the largest value, which is 7max: 

i^i + -\A^|0vJ\/7max = 7max • 

By solving the equation for Vi = V'f we obtain 

^f = ^(V|0v^J^+4-|e^j)'. (4.6) 

The same procedure can be repeated Vi = 1,...,A'^. The V'f obtained are collected into a vector 
t/>^ = (Vf,...,Vf,---,^^^)^- Since 

ip^ is a feasible solution. □ 

This lemma is useful, because it allows us to establish the existence of an optimal solution: 

Lemma 4.3. Problem (4.2) admits an optimal solution ip*{t), which is the solution of the following 
set of nonlinear equations: 

+ E V^/?W-7max = i=l,...,N. 
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Proof. The proof is based on a useful rewriting of the optimization problem and by a reductio ad 
absurdum argument. 

Let yf = tpi fov i = 1, . . . , N . Then, the optimization problem (4.2) can be rewritten as follows 

max y^y (4.7) 
y 

s.t. y - f (y) < (4.8) 
y^o. 



where f (y) = (/i(y), . . . , /jv(y))^ and 

/i(y) = yi - P [yf + yi vj - 7max 

with (3 being any positive scalar. This problem and (4.2) are obviously equivalent: for all /? > 0, 
^ if and only if y — f(y) ^ 0. Let y* be an optimal solution of (4.7), then ip* = y*^. 
Problem (4.7) admits optimal solutions, since from Proposition 4.2 the problem is feasible. We show 
next that the optimal solutions satisfy the constraints at the equality. 

Let y* be an optimal solution. Suppose by contradiction that there is constraint i that is satisfied 
at a strict inequality, namely y* < /i(y*), while suppose y* < fj{y*) for i ^ j. In the following, we 
show that from y* we can construct a feasible solution t* such that t*"^t* > y*^y*, so that it is not 
possible that y* be an optimal solution. 

Since /3 is arbitrary, we can select a convenient value. Let 

P < < min ^ ^ 



o^y^i 2yi + Ejee^. VJ 2 + e<^. ' 

This choice of /3 makes fi{y) being an increasing function of yi, and a decreasing function of yj, for 
j ^ i- Indeed 

VJi(y*) = l-/3 l2y*+ J2 vA >0' 



V f^v*^-/ -/5y*<0' ifjee^^ 



and 



V^A(y*) = -2/5<0, 
V,V^(y*) = if j^i, 

Let z e R-'^ such that Zi e (0, 1]. We have 

/,(z) = My*) + V/i(y*)(z - y*)^ + i(z - y*fV'My*){z - y*) , (4.9) 

because the third order derivatives are zero. Then, we chose a small positive scalar < e < /i(y*) — y* 
so that z be an augmented vector of y*, with Zi = e + y*, zj = for j = 1, . . . ,N, j ^ i, and 
Zi = y* + s < fi{y*) < fi{^)- The last inequality is allowed by the fact that fi{y) is an increasing 
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function of j/j. From (4.9) it follows 



/i(y*) + A/,, 



1-/3 2y*+ J2 



/,(z) = /,(y*) + V./,(y*)e + -e'Vffjiy*) = /,(y*) - Py*e 

^/,(y*)-A/,- if jee^,, 
fe{z) = fi{y*) otherwise . 

By using e and Afj, j ^ i, we can define a vector t* such that t* = y*+s, t* = y* —Afj if j £ 0,^., and 
= y* otherwise. Notice that f(z) < f(t*) since t* ^ z. The solution t* is feasible for problem (4.7), 
namely t* ^ f (t*), because t* = y* + e = < /,(z) < /,(t*), t* - y* - Af, < /^(y*) - A/, = /,-(z) < 
/j(t*) if j e 6<^. and t} = y} < fe{y*) = fe{t*) iil^i and I ^ 6<^.. Now, observe that 

JV JV 

The last right-hand side of previous equation is always positive, provided that one chooses 



£ < 



This implies that t*"^t* > y*"^y*, namely that t* is a feasible solution of (4.7) with higher cost fimction 
than y*, which is a contradiction because y* was assumed to be an optimal solution. It follows that 
optimal solutions must satisfy all the constraints at the equality. □ 

The previous lemma guarantees that there are optimal solutions satisfying the constraints at the 
equality. However, we do not know yet if there is a global optimal solution. If there were multiple 
optimal solutions, we would have to chose the most fair for all nodes. Recall that a small (f) means 
smaller estimation quality. To establish the uniqueness of the optimal solution, we need the following 
lemma, which will be used for the proof of Theorem 4.1: 

Lemma 4.4. Let J{'ip{t)) = VS{'ip{t)) be the Jacobian ofS{'ip{t)). Then J{'ip{t)) is a nonsingular 

m,atrix. 



Proof. The diagonal elements of the Jacobian are 



whereas the off-diagonal elements ViSj{'ip) are either zero if j ^ Oj^., or 



2v^ 



if j e e^^ 



By applying the Gershgorin theorem, we have that the eigenvalues of the Jacobian lie in the region 



JV 



4(J(V'))e \J{z€€:\z-Jum<J2\'^Ji\} 



i=l 
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from which it follows that the real part of the minimum eigenvalue is such that 



1 




1 



Therefore, J(V') has no zero eigenvalues, namely it is non-singular. 



□ 



We are now in the position of proving Theorem 4.1. Prom Lemma 4.3, we know that there is an 
optimal solution satisfying the constraints at the equality. We show next that such a solution is unique, 
thus proving Theorem 4.1. 

Proof of Theorem 4-1- The proof of the uniqueness of the optimal solution is based on the use of the 
Lagrange dual theory. First, observe that from Lemma 4.3 the optimization problem admits optimal 
solutions. The optimization problem is non-convex, since the constraints (4.3) are not convex. The 
Lagrange dual theory for non-convex non-linear optimization problems can be applied. A qualification 
constraint from [25, pag. 25] states that strong duality holds if the optimization problem is feasible and 
the Jacobian of S{xp{t)) is non-singular, which we know from Lemma 4.2 and Lemma 4.4, respectively. 
Tlierefor(\ the optimal solution of the problem can be investigated via the Lagrange dual function 
H^,'^) = --0^1 + ^^S(i/>), where ^ ^ is the La grangian multiplier. From the KKT conditions it 
follows that 3{ip)'^$, = 1. W c sec that previous equality trivially holds also for the optimal solution 
■0*, namely J('0*)"^^* = 1. Since from Lemma 4.4 the Jacobian is non-singular, it follows that there 
is a unique solution to the previous system of equations, namely ^* = J('j/j*)^"^1, and since strong 
duality holds, we conclude that the optimal solution given by (4.4) is unique. □ 

Corollary 4.5. Let'tp*{t) be the solution of 4-4- Then, ip*{t) y 'tp^{t), where ip^{t) is given by (4.5). 

Proof. The simple proof is by contradiction. Suppose that is not a lower bound on the optimal 
solution tp*, namely there is some i for which tjj* < tl)f. By observing that that tp* ^ 7maxl, it follows 



which is a contradiction, because the optimal solution must satisfy the constraints of 4.2 at the equality. 



We use Theorem 4.1 and Corollary 4.5 in the next sections to develop a strategy for the distributed 
computation of the optimal solution. 

4.2 Distribution of the Computation 

Prom the previous section, we compute the thresholds to use in (3.8) by the system of nonlinear 
equations (4.4). Unfortunately, an explicit solution for such a system is not available. Numerical 
techniques have to be used. In the following, we present a quick decentralized algorithm with certified 
convergence. 

We define the class of functions parameterized in the scalar pi 



the fixed point of the mapping is the solution of (4.4) [26, Pag. 191]. Furthermore, the convergence 
speed can be tuned at a local node i by the parameter pj. We have the following result 




□ 



(4.10) 




When R(i/') is contractive, then it is easy to show that 
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Proposition 4.6. Let 



pm = { 



if 1 



otherwise . 



1 miA'^x- V^^'(^) 



(4.11) 



Then tpilk + 1) = Ri{ip{k)) = ipiik) — Pi{k)fi{ip{k)) is a contraction mapping having the largest 
convergence speed among the mappings (4.10). 

Proof. Proposition 1.10 in [26, Pag. 193] gives a sufficient condition to establish that (4.10) is a con- 
traction mapping. If 



1 > a^{k) = |1 - p,{k)WiM^P{k))\ + J2 \p^{k)VjM^{k))\ 



1 - Pi{k) 1 + 



+ ..,1.1 V 



where Vj is partial derivative operator with respect to ipjik), then (4.10) is contractive. The scalar ai{t) 
determines the converge speed of the mapping, so that the lower is ai{k) the faster is the convergence. 
Suppose that 



1 + 



(4.12) 



Then, ai{k) is minimized if 



Piik) = 



1 + 



Suppose that (4.12) does not hold, than ai{k) is minimized if Pi(fc) = 0. By putting together these 

cases, the proposition follows. □ 

From previous proposition, the overall mapping ip = R(i/')) where R : M,^ is a contraction 

mapping. The component solution method [26, Pag. 187] can be applied. The solution of (4.2) is given 
by the algorithm 



iP{k + 1) = R(V(fc)) = {Ri{ip{k)), . . . , RwrnkW • 



(4.13) 



Using the p*{k) given by Proposition 4.6, the mapping converges quickly. From Monte Carlo simula- 
tions, we see that the algorithm converges in less than 10 iterations on average. 



4.3 Algorithm for the Computation of the Thresholds 

The distributed computation of the thresholds ijji requires that the neighboring nodes communicate 
the instantaneous values of the local threshold, until (4.13) converges. Clearly, the thresholds are 
over the same wireless channel used for broadcasting estimates and measurements, and thus they arc 
subject to packet losses. These losses may happen during the phase between the beginning of the 
iterations (4.13) and the convergence. As a result, no convergence may be reached. In the following, 
we develop a strategy to cope with this problem. 

First, notice that the optimization problem is not sensitive to perturbations of the constraints. In 
other words, if tp*{t) is the solution of the system of non linear equations (4.4), then tp*{t) is not 
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significantly perturbed by paeket losses. We ean see this from the proof of Theorem 4.1, form where 
we know that the optimal solution is such that J{tp*)'^^* = 1, with J{tp*) being the Jacobian of the 
constraints and ^* the Lagrange multipliers associated to the dual problem of (4.2). Specifically, the 
i-th equation of J(i/'*)^^* = 1 is given by 

+ ™ E v^)+ E ^;^ = i ^ = i,--,^- (4.14) 

This system of equations has positive coefficients, and ^1 + 1/(2a/V^) Sjee \/^) ^ ^iiice 
^* >: 0, for strong duality holds, it follows that ^* < 1 for i = 1, . . . , A^. Then, ^* < 1 implies that the 
optimal solution is not sensitive to perturbations of the constraints [23, pag. 249]. 

Since a change in the muuber of two-hops neighbors of a node, caused by packet losses, can be 
regarded as a perturbation of the constraints, we conclude that the optimal solution of the problem (4.2) 
is not much sensitive to the packet losses. By this argument, we can compute just once the optimal 
solution. In particular, we assume that the nodes compute the optimal thresholds before the estimation 
algorithm starts by considering the maximum number of neighbors. This is accomplished by using high 
transmission radio powers and a retransmission protocol that guarantee a successful packet reception. 
Such a preliminary phase is very short, since from Proposition 4.6 the computation of the thresholds 
according to (4.13) requires few iterations to converge. During the estimation phase, if the packet 
loss probability is very high, the perturbation might be large, resulting in a significant change of the 
optimum. However, simulations reported in Section 6 show that the solution we adopt for the threshold 
computation is robust to rather intense packet losses. 



5 Performance Analysis 

In this section we characterize the performance of our estimator by investigating the variance of the 
estimation error. We have the following results: 

Proposition 5.1. For any packet loss realization tp^^ of <p^{t), the optimal value o/ki(t) and hi{t) 
are such that the error variance at node i satisfies 

E„(e?-Ee?|0,(t)=vp,|,)^< 



Proof. From (4.1) the error variance is upper-bounded by 

2 2 

lEv(ei(t) - E^ei{t)\<Pi{t) = <Pi\tf < : n < 



where the inequality comes from the fact that (pf^^ (^(P(,t — 1) + Ai(i)I) o ip-^^ipj^^^ ip-^^ > 0, and re- 
calling that fj^fip^t = |A/'v,J. □ 

Notice that previous proposition guarantees that the estimation error at each time t, and in each 
node, is always upper-bounded by the variance of the estimator that just takes the averages of the 
received Ui(t). 

Proposition 5.2. For any packet loss realization (p^^ of (f)(t), 

iu ((P(. - 1) + A.(,)i) o < (i + ^^_^-;"^ ) . 
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Proof. It holds that Cm ((P(t - 1) + Ai(i)I) o iPiitVJit) = ((P(t - 1) o 'P^fl^ + \{t). Recalling 
that \i{t) e [0, max (O, (^'^/-s/WvM^ ~ ^mC^ii* - 1)))] follows 



0-2 , 2 2Ar 



Xi{t) < : < < cr 



where previous inequality comes from Corollary 4.5, observing that (i/jQc^J^ + 4 — |0(^.|) > {V5 — 
l)/|©vj > (V5 - 1)/A^, and that \J\f^^\ > 1. Furthermore, Im ((P(i - 1) o fMipf {t)) can be upper- 
bounded using the Gervshgorin theorem, and recalling that the diagonal elements of P(t — 1) o ip^^^ipT^^ 
are less than l\M^\ from Proposition 5.1, and that each diagonal element of a covariance matrix 
assumes the largest value along its row. Hence, it follows that Im (^(P(i ^ 1) ° ViliVfit^ — ^"^^ Putting 
together previous inequality, and the upper bound on Xi{t), the proposition follows. 

□ 



Lemma 5.3. 



IMI-i 

^4<t>f<t>r'= E (5-1) 

k=0 



where 



x{k) = E n • n Pi'i"') ' (^-^^ 

e=l \n=l m=fc+l J 

and the function s : {1, 2, . . . , |A/i| — 1} ^ {1, 2, . . . , jA/ij — 1} is a permutation. Namely the k-th 
coefficient of the polynomial is the sum of ('■'^*;!~^) terms in which there are k factors qij and |Ai | — 1 — fc 
factors pir with j ^ r. 

Proof. The random variable = SjLi ^ij = 1 + Yl^j^'i j^i^ij is given by the sum of — 1 

independent Bernoulli random variables having different parameter. Then, we have [27] 



gi{z)dz, (5.3) 







where gi{z) is the probability generating function of (jij^i- 

|M| IMI 

gi{z) = E^-^"*^-! = n Ez^- = Hfei = Xo + Xi^ + • • • + X|M|-i^'^''"' 

where the last equality is achieved by developing the product of terms + PijZ in a polynomial in 
the general form. After tedious manipulations, we see that the coefficients of the polynomial are given 
by (5.2). By using gi{z) in the integral (5.3), we obtain the result. □ 

Proposition 5.4. For any packet loss realization ip^^ of cl)^{t), it holds 

E,E.(e.-E.eJ<^^^_^^^^^^. g M) 
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Proof. The variance at node i is bounded as in (4.1). Following the same steps as in the proof of 
Proposition 5.1, we have 



E^lE^(e,2- E^e^) < E,^ ^ . (5.5) 

aVf ((Pit - 1) + Ai(i)I) ° 0i0r) 0i + ^l4>i 

This inequality is based on the fact that the argument of the statistical expectation is always positive 
and the expectation is taken over a positive distribution, thus the sign of the argument is main- 
tained [28, pag.392]. From Proposition 5.2, it follows 



<pj ((p(t - 1) + x^m o )^ 0, > 



1 



2N 



(v^-l)V7n 



By using previous inequality in (5.5), we have 



E^E^e^- E^e^) < /.^ E, 



2(\/5-l)7w + 2A^ "^r^i' 
The proposition follows by invoking Lemma 5.3. □ 

Observe that the estimation error variance given by the previous proposition depends on the packet 

loss probabilities qtj, on the maximum number of neighbors for each node {Mil- the total number of 
nodes in the networks N, and the largest singular value of the matrix K(t). In Figure 1 we have plotted 
the first factor of the coefficient of (5.4). It turns out that it is always less than 1. The smallest values 
are achieved when 7inax is large and N small. The second factor in (5.4) clearly depends on the value 
attained by the various qij. We consider here the simple case when qij = q for all which allows us 
writing the equations in closed form: 

In Figure 2 we have plotted such a function for various values of q and \J\fi\. The function decreases 
very fast as the maximum number of neighbors of a node increases, for all values of q (notice that 
we have considered that q = 0.3 at most, namely a packet loss probability of 30%). This is rather 
intuitive, since as the number of neighbors increases packet losses have less impact on the estimation 
and thus better performance are achieved. Notice also that the value of the function (5.6) for g = 
is l/|jVi|. Thus in presence of non-identical packet loss probabilities the degradation in performance 
is not remarked. In particular even when the first factor of (5.4) is very close to 1, if the number of 
neighbors is greater than 2, with a packet loss ofq = 0.3 we have that the product of the two coefficient 
does not exceed 0.65 and it is only a 30% higher than the case when no packet losses are present. 

Corollary 5.5. Consider as benchmark the estimator computing the estimates by the instantaneous 
average of the available measurements, namely the estimator for which the weights are chosen to be 
kij = and hij = l/\M,p.\, for all i = 1,.. .,N, and j G A/'v'i- Then, limj^+oo E„ei(t) = and the 
variance is 



E,E„e?=E^-^ = a2 J] (5.7) 



From this corollary we see that the difference in the expected performance between the proposed 
estimator, given by (5.4), and the unbiased estimator that does an arithmetic average, given by (5.7), 
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Figure 1: First factor of (5.4) as function of 7max for increasing values of N ranging from 2 to 100. 
The factor is always less than 1 for all the values of the parameters. The smallest values are achieved 
when 7inax is large and TV is small. 




q 



Figure 2: Second factor of (5.4) as function of q for increasing values of \Afi\ ranging from 1 to 20. 
The factor is always less than 1. The smallest values are achieved when q is small and \Afi\ is large. 
This is explained by the fact that the packet loss probability has a decreasing negative effect when the 
number of neighbors of a node increases, which translates into a smaller value of the coefficient. 
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(a) 



(b) 



Figure 3: On the left an example of Cayley graph Q = {G, S) defined on the group G — Z15 and 
S — {0, ±1, ±3, ±4}. On the right a plot that shows the first coefficient of (5.4) as function of 7max 
and V when the network topology is described by a Cayley graph. Furthermore, 7max G [0.5, 1) and 
v G [0, 10]. As it can be seen, the coefficient is always less then one, and, compared to Figure 1, it 
depends only on ly, namely the degree of a node and not on the network size N. 

is on the first coefficient of (5.4). Clearly, the proposed estimator outperforms the latter as the factor 
in (5.4) is always less than one, as shown in Figure 1. 

However, the bound (5.4) has been derived by Proposition 5.4, where the cardinality of the set O,^. 
is bounded by N. Obviously, this is in general a very conservative bound. The set 8,^. depends on 
the network topology, and no tight bound can be derived unless some assumptions are given on the 
network topology itself. We will show next that when we assume information on the network topology, 
we are able to bound \Qcp^ \ more accurately. This further underlines the improvement of the proposed 
estimator provides with respect to the benchmark estimator of Corollary 5.5. 

Example 5.6. Consider a simple line-graph. Let i be a node at the extreme of the line-graph, then we 
have that \Qip.\ < 2. Let j be a node of the line- graph different from the extremes, then we have that 



By assuming that 7max G [0.5, 1), we see that the coefficient in (5.4) is at most 0.76 for the border nodes 
and 0.708 for those in the middle, regardless the packet losses. Thus we have a significant improvement 
with respect to the estimator that takes the average of the measurements. 

Example 5.7. Consider the family of finite Cayley graphs defined on a finite additive Abelian group 
(C?, +), with G — {0,1,...,|V| — 1} = N, namely the elements of the group can be regarded as the 
labels of the nodes. The operator + is considered as addition modulo N . Let us consider S C G, such 
that Cz S and S is closed under the inverse, namely if a Cz S , then —a G S . Two nodes, i and j , 
communicate if and only if (i — j) mod N d S . Thus if G = {0, 1,2,..., N} and S = {0, ±1, ±3, ±4}, 
then we have a graph in which node i has as neighboring nodes those with label i ± 1 mod N , i ± 3 
mod N and i ± 4 mod N. In Figure 3a a Cayley graph is shown, where G = Z15. 

We have that each node communicates with \S\ — 1 = v nodes. In other words, two distinct nodes 
have in common at most v nodes. This implies that |0i^. | <2v Thus 



\ < 3. Thus 



v/|e^J' + 4-|e^,|> 



{ 



V22 + 4 - 2 0.828, if£ = i; 
V32 + 4- 3 « 0.606, ifi^j. 
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The first coefficient of (5.4) as function of "/^i^x cind v is shown in Figure 3b. Notice that the function 
is similar in shape to that in Figure 1, however in this case the dependence is on v, and not on the 
total number of nodes in the network N. Albeit the network might have a total number of nodes many 
orders of magnitude larger than v, the coefficient stays well below the one in (5.4), which has been 
obtained when no information about the network is known. 

Since the coefficient of (5.4) is always much less than one, the designed estimator outperforms 
significantly the estimator that takes the average of the measurements. 

6 Simulations and Numerical Results 

Numerical simulations have been carried out to compare the estimator proposed in this paper with 
some related estimators available from the literature. 
We consider the following five estimators: 

El'. K(f) = H(i) = (I — L(t))/2 where L(i) is instantaneous Laplacian matrix associated to the graph 
Q. Clearly the graph changes when packets are dropped, so that arcs disappear from the graph. 

E2. K(f) = and H(t) = [/lij(t)] with hij = l/\J\f^.\ if node i and j communicate, and hij = 
otherwise. Thus, the updated estimate is the average of the measurements (this is the estimator 

of Corollary 5.5). 

E3: K.{t) = [kij{t)], where kii{t) = 1/21^^,^.1, kij = l/\J\fip. \ if node i and j communicate, kij{t) = 
otherwise, whereas H(t) = [/iy(t)] with hu = l/2\J\f^.\, and hij = elsewhere. This is the 
average of the old estimates and node's single measurement. 

E4: K(t) = H(t) with kij{t) = l/2\Af^.\ if node i and j communicate, and kij{t) = otherwise. The 
updated estimate is the average of the old estimates and all local new measurements. 

Epi The estimator proposed in this paper. 

The estimators _Ei,...,i?4 arc based on various heuristics. They arc related to proposals in the 
literature, e.g., Ei uses filter coefficients given by the Laplacian matrix, cf., [10]-[12] and E2 and -B3 
are considered here as benchmark. Observe that the weights based on Laplacian do not ensure the 
minimization of the variance of the estimation error. 

Figure 4 shows a set of test signals di{t),. . . , d5{t) that has been used to assess the various estima- 
tors. Note that the signals differ only in their frequency content. The test signals are highly nonlinear 
and generated so that the signal presents intervals in which is very slowly varying (low absolute value 
of the derivative) and intervals in which the derivative is higher. The choice of the parameter 7inax is 
based on the maximum cumulative bias, defined in Equation (3.5): Let Pb denote the desired power 
of the cumulative biases of the estimates. Since there are N nodes, wc consider the average power of 
the bias of each node as T = Pb/N. Assume that we want the estimator to guarantee that the power 
of the right-hand side of Equation (3.5) is equal to T. This is equivalent to 

_ Vr 
Vr + A 

In the simulations we have chosen T = dB, which is a rather low value, and the noise variance is 
fT^ = 1.5, which is quite a large value if compared to the amplitude of the signal to track. We also 
assume that he value of A is not known precisely but with an upper bound of about 5% from its real 
value. Therefore, these choices allow us to test the proposed estimator in the worst conditions. 

Wc have considered 30 geometric random graphs with TV = 20 nodes uniformly distributed in a 
square of side length equal to N/2. Two nodes are connected if their Euclidean distance is less than 
1.7VN. The average neighborhood size of all the considered networks is 6.6 nodes with a maximum and 
minimum neighborhood size of 11 and 3, respectively. The estimation of the test signals di{t), . . . , d5{t) 
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Figure 4: Test signals used in the simulations. Signal d2{t), . . . ,d5{t) are obtained from di{t) by 
changing the frequency. The test signals are highly nonlinear and are generated so that the signal 
shows intervals in which the derivative (in absolute value) is small and intervals in which the derivative 
is higher. 



has been performed under four different packet loss probabilities. More precisely we have consider the 
case in which qij = 0%, qij — 10% ± 5%, Qij = 20% ± 5% and qij = 30% ± 5%. We also considered 
different measurement noise realizations. We take the mean square error of the estimates of each 
node as performance measure. Each estimator has an initial transition phase, during which the mean 
square error may not be significant. Hence, it has been computed after 70 steps. Afterwards, the mean 
square error has been averaged over all nodes of the network. The average is denoted by MSE. We 
define the improvement factor of our estimator compared to the estimators Ei, . . . , as 

_ MSE(^,) - MSE(^p) 

MSE{E,) ' «-^>---4. 

Figure 5 shows the MSE(i?i) for all the five different estimators as function of the packet loss probability. 
In the simulation we assume the nodes know the threshold value before the estimation process starts. 
This is typically achieved after 10 time steps and thus the network experiences a short initialization 
phase. Recall also that in all the simulations related to the proposed estimator Ep, the error covariance 
matrix is estimated at each node and reset as described in the end of Section 3. Notice that Ep 
outperforms all other estimators for any considered packet loss probability. The two estimators E^ 
and E4 have performance closer to Ep. In particular E3 performs better when the signal is slow, whereas 
E4 has better performance when the signal is faster. This is motivated by the fact that the estimator 
E3 uses only one measurement and it is affected by a bias that depends on the derivative of signal 
d{t) (see Figure 6). The E3 performance improves as the packet loss probability increases. Indeed, 
the single local measurement is weighted more when packet losses occurs, i.e., previous estimates as 
lost, and thus the overall estimate becomes less biased. The other two heuristic estimators, E2 and 
E3, offer poor performance clearly in all the situations we have considered. Figure 7 shows that the 
local computation of the weights ki(t) with (3.9) yields a stable K(<). In particular, higher values 
of 7(K(i)) are experienced when the signal is slower, because A is small. Viceversa, lower values of 
7(K(i)) are obtained when A is large. This is explained by considering that when the signal is slow, 
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Figure 5: Mean Square Error (MSE) performance comparison among estimators for various packet 
loss probabilities q. Each plot is associated to one of the five test signals di(t), . . . , dc,{t), see Figure 4. 
The marker (*) refers to Ei, (0) refers to E2, (□) refers to E^, (•) refers to £'4 and (>) refers to 
the proposed estimator Ep. The vertical bars represent the variance of the MSE computed for the 30 
simulations. Notice that the proposed estimator Ep has a very low variance, thus showing its high 
robustness to packet loss and measurement noise. 
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Figure 6: Realizations of the signal to be tracked, c?2, as it is measured by all the iV = 20 nodes, with 
a packet loss probability q ~ 10% ± 5%. Notice that the proposed estimator Ep, visibly outperforms 
all the other estimators in term of variance. Notice also that the proposed estimator presents a small 
bias when the signal changes more rapidly. 
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Figure 7: Largest value of the singular value of the matrix K(i) in the 30 Monte Carlo simulation 
generated. The rows of K(t) are computed individually by each node using (3.9). The curves corre- 
spond to the five test signals di{t), . . . , d^{t), and they are ordered from the slowest signal to 
the fastest one {d^{t)). 



then it is better to weight more previous estimates (which means larger K(i) and hence larger 7(K(<:))) 
to achieve small variances of the estimation error. By the same argument, when the signal is fast, then 
it is better to weight less previous estimates. 



7 Conclusions and Future Work 

In this paper, we presented a decentralized cooperative estimation algorithm for tracking a time- 
varying signal using a wireless sensor network with lossy communication. A mathematical framework 
was proposed to design a filter, which run locally in each node of the network. Performance analysis 
was carried out for the distributed estimation algorithm in time varying communication networks with 
packet loss probabilities both non-identical and identical among the links, losses. We investigated 
how the estimation quality depends on packet loss probability, network size and average number of 
neighboring nodes. The theoretical analysis showed that the filter is stable, and the variance of 
the estimation error is bounded even in the presence of large packet loss probabilities. Numerical 
results illustrated the validity of our approach, which outperforms other estimators available from the 
literature. 

Future studies will be devoted to the extension of our design methodology to the case when models 
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of the signal to track arc available. Lossy communication links with memory will also be included. 
Furthermore, we plan to implement our distributed filter on real wireless sensor networks, thus exper- 
imentally checking the validity of our theoretical predictions. 
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